interactive policy
Structured Imitation Learning of Interactive Policies through Inverse Games
Generative model-based imitation learning methods have recently achieved strong results in learning high-complexity motor skills from human demonstrations. However, imitation learning of interactive policies that coordinate with humans in shared spaces without explicit communication remains challenging, due to the significantly higher behavioral complexity in multi-agent interactions compared to non-interactive tasks. In this work, we introduce a structured imitation learning framework for interactive policies by combining generative single-agent policy learning with a flexible yet expressive game-theoretic structure. Our method explicitly separates learning into two steps: first, we learn individual behavioral patterns from multi-agent demonstrations using standard imitation learning; then, we structurally learn inter-agent dependencies by solving an inverse game problem. Preliminary results in a synthetic 5-agent social navigation task show that our method significantly improves non-interactive policies and performs comparably to the ground truth interactive policy using only 50 demonstrations. These results highlight the potential of structured imitation learning in interactive settings.
IMTLab: An Open-Source Platform for Building, Evaluating, and Diagnosing Interactive Machine Translation Systems
Huang, Xu, Zhang, Zhirui, Gao, Ruize, Du, Yichao, Liu, Lemao, Huang, Guoping, Shi, Shuming, Chen, Jiajun, Huang, Shujian
We present IMTLab, an open-source end-to-end interactive machine translation (IMT) platform that enables researchers to quickly build IMT systems with state-of-the-art models, perform end-to-end evaluation, and diagnose system weaknesses. IMTLab treats the whole interactive translation process as a task-oriented dialogue in a human-in-the-loop setting, in which human interventions can be explicitly incorporated to produce high-quality, error-free translations. To this end, a general communication interface is designed to support flexible IMT architectures and user policies. Based on the proposed design, we construct both simulated and real interactive environments for end-to-end evaluation and use the framework to systematically evaluate previous IMT systems. Our simulated and manual experiments show that the prefix-constrained decoding approach still achieves the lowest editing cost in the end-to-end evaluation, while BiTIIMT achieves comparable editing cost with a better interactive experience.
Interactive Concept Bottleneck Models
Chauhan, Kushal, Tiwari, Rishabh, Freyberg, Jan, Shenoy, Pradeep, Dvijotham, Krishnamurthy
Concept bottleneck models (CBMs) are interpretable neural networks that first predict labels for human-interpretable concepts relevant to the prediction task, and then predict the final label based on the concept label predictions. We extend CBMs to interactive prediction settings where the model can query a human collaborator for the labels of some concepts. We develop an interaction policy that, at prediction time, chooses which concepts to request labels for so as to maximally improve the final prediction. We demonstrate that a simple policy combining concept prediction uncertainty and the influence of the concept on the final prediction achieves strong performance and outperforms static approaches as well as active feature acquisition methods proposed in the literature. We show that the interactive CBM can achieve accuracy gains of 5-10% with only 5 interactions over competitive baselines on the Caltech-UCSD Birds, CheXpert and OAI datasets.
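As a rough illustration of the kind of interaction policy described above, the following minimal Python sketch ranks concepts by the product of prediction uncertainty and influence. The abstract does not state how the two quantities are combined, so the product rule, the use of binary entropy as the uncertainty measure, and the names `binary_entropy`, `select_concepts`, `concept_probs`, `influence`, and `budget` are all assumptions made for illustration, not the authors' method.

```python
import math


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli(p) concept prediction."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def select_concepts(concept_probs, influence, budget):
    """Return indices of the `budget` concepts worth querying first.

    ASSUMPTION: score = uncertainty * |influence|. The source only says
    the policy "combines" the two signals; the product is a placeholder.
    """
    scores = [
        binary_entropy(p) * abs(w)
        for p, w in zip(concept_probs, influence)
    ]
    # Sort concept indices by descending score and keep the top `budget`.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Hypothetical inputs: predicted concept probabilities and the weight
# each concept carries in the final label predictor.
probs = [0.5, 0.95, 0.6, 0.1]
weights = [0.2, 2.0, 1.5, -0.3]
chosen = select_concepts(probs, weights, budget=2)
```

Under this assumed rule, a concept that is maximally uncertain but barely affects the final prediction (index 0 here) can rank below a fairly confident concept with a large weight, which is the intuition behind combining the two signals rather than using uncertainty alone.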
Interaction-Aware Multi-Agent Reinforcement Learning for Mobile Agents with Individual Goals
Mohseni-Kabir, Anahita, Isele, David, Fujimura, Kikuo
In a multi-agent setting, the optimal policy of a single agent is largely dependent on the behavior of other agents. We investigate the problem of multi-agent reinforcement learning, focusing on decentralized learning in non-stationary domains for mobile robot navigation. We identify a cause for the difficulty in training non-stationary policies, namely mutual adaptation to sub-optimal behaviors, and we use this to motivate a curriculum-based strategy for learning interactive policies. The curriculum has two stages. First, the agent leverages policy gradient algorithms to learn a policy that is capable of achieving multiple goals. Second, the agent learns a modifier policy that governs how it interacts with other agents in a multi-agent setting. We evaluated our approach on both an autonomous driving lane-change domain and a robot navigation domain. Single-agent reinforcement learning (RL) algorithms have made significant progress in game playing [20] and robotics [13]; however, single-agent learning algorithms in multi-agent settings are prone to learning stereotyped behaviors that over-fit to the training environment [22], [15]. There are several reasons why multi-agent environments are more difficult: 1) interacting with an unknown agent requires having either multiple responses to a given situation or a more nuanced ability to perceive differences. The former breaks the Markov assumption; the latter rules out simpler solutions, which are likely to be found first.
Top 8 Healthcare Predictions for 2019
What can you look forward to in healthcare in 2019? The debate is expected to heat up between AI vs. Physicians, Consumer vs. Clinical, and Human Empathy vs. Machine Intelligence as many new players enter the ecosystem. We have been writing healthcare predictions every year for the past 10 years. We also review how we did each year, and each year we are getting more accurate. The 2018 predictions, released in December 2017, were almost 98% accurate, and each one of them panned out during the course of the year. Globally, 2019 will be a year of value-based care as we expect the 'outcomes-based care' focus to globalize.